Abstract

Let $G$ be a bounded convex set, $\Pi_G$ the projection onto $G$, and $\{\xi_j\}$ a bounded random process. Projected algorithms of the types $X^\epsilon_{n+1} = \Pi_G(X^\epsilon_n + \epsilon b(X^\epsilon_n, \xi_n))$ (or $X_{n+1} = \Pi_G(X_n + a_n b(X_n, \xi_n))$, where $0 < a_n \to 0$, $\sum a_n = \infty$) occur frequently in applications (among other places) in control and communications theory. The asymptotic convergence properties of $\{X^\epsilon_n\}$ as $\epsilon \to 0$, $\epsilon n \to \infty$, have been well analyzed in the literature. Here, we use large deviations methods to get a more thorough understanding of the global behavior. Let $\theta$ be a stable point of the algorithm in the sense that $X^\epsilon_n \to \theta$ in distribution as $\epsilon \to 0$, $n\epsilon \to \infty$. For the unconstrained case, rate of convergence results involve showing asymptotic normality of $\{(X^\epsilon_n - \theta)/\sqrt{\epsilon}\}$, and use linearizations about $\theta$. In the constrained case $\theta$ is often on $\partial G$, and such methods are inapplicable. But the large deviations method yields an alternative which is often more useful in the applications. The action functionals are derived and their properties (lower semicontinuity, etc.) are obtained. The statistics (mean value, etc.) of the escape times from a neighborhood of $\theta$ are obtained, and the global behavior on the infinite interval is described.



I. Introduction

Let $q_i(\cdot)$, $i \le k$, be continuously differentiable and let $G = \{x: q_i(x) \le 0, i \le k\}$ be a compact convex set which is the closure of its interior. Let $\{\xi_n\}$ be a bounded sequence of random variables and $b(\cdot,\cdot)$ a bounded function with $b(\cdot,\xi)$ uniformly (in $\xi$) Lipschitz continuous. Define $\Pi_G(x)$ to be the (nearest point) projection of $x$ onto $G$.

The projected recursive (or stochastic approximation) algorithm (1.1) arises frequently in applications in control and communications theory.

(1.1) $X^\epsilon_{n+1} = \Pi_G(X^\epsilon_n + \epsilon b(X^\epsilon_n, \xi_n))$, $x \in R^r$, $X^\epsilon_0 = x_0$ given.

There is a sizeable literature (e.g., [1] to [5]) concerning its asymptotic properties as $\epsilon \to 0$ with $\epsilon n \to t$ or $\epsilon n \to \infty$. Often $\epsilon$ is replaced by a 'stochastic approximation' sequence $\{a_n\}$ with $a_n \to 0$, $a_n > 0$, and $\sum a_n = \infty$. The methods of analysis are similar in both cases, except that the latter case ($a_n \to 0$) allows the possibility of w.p.1 convergence of $\{X_n\}$.
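The projected recursion can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration only: $G$ is taken to be the unit box (so the projection is coordinatewise clipping), the drift `b` pulls toward a hypothetical `target` that lies outside $G$, and the noise is uniform on a bounded interval. None of these choices come from the paper.

```python
import random

def project_box(x, lo, hi):
    """Nearest-point projection of x onto the box G = [lo, hi]^r (illustrative G)."""
    return [min(max(xi, lo), hi) for xi in x]

def projected_sa(x0, b, noise, eps, n_steps, lo=0.0, hi=1.0):
    """Iterate X_{n+1} = Pi_G(X_n + eps * b(X_n, xi_n)) and return the whole path."""
    x = project_box(x0, lo, hi)
    path = [x]
    for _ in range(n_steps):
        xi = noise()                      # bounded noise sample xi_n
        step = b(x, xi)                   # b(X_n, xi_n)
        x = project_box([xj + eps * sj for xj, sj in zip(x, step)], lo, hi)
        path.append(x)
    return path

if __name__ == "__main__":
    rng = random.Random(0)
    target = [1.5, 0.25]                  # hypothetical; first coordinate lies outside G
    b = lambda x, xi: [t - xj + xi for t, xj in zip(target, x)]
    noise = lambda: rng.uniform(-0.5, 0.5)
    path = projected_sa([0.5, 0.5], b, noise, eps=0.01, n_steps=5000)
    print(path[-1])
```

In this toy setting the mean dynamics push the first coordinate against the face $x_1 = 1$, so the limit point sits on the boundary of $G$, which is the situation where linearization arguments fail.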


Typical results are the following. For a velocity vector $v$, define the projection of $v$ at $x \in G$ by $\Pi_G(x,v) = \lim_{\Delta \downarrow 0} [\Pi_G(x + \Delta v) - x]/\Delta$ and write $b(x) = E\,b(x,\xi)$ ($b(\cdot)$ to be redefined below). The equation

(1.2) $\dot x = \Pi_G(x, b(x))$

represents the projected dynamics on $G$ for the ODE $\dot x = b(x)$. Let $x^\epsilon(\cdot)$ denote the piecewise linear interpolation of $\{X^\epsilon_n\}$ with interpolation interval $\epsilon$. Under reasonable conditions, $X^\epsilon_n$ converges in distribution to the set of stationary points of (1.2), as $\epsilon \to 0$ and $\epsilon n \to \infty$; also $x^\epsilon(\cdot) \to x(\cdot)$, a process which satisfies (1.2). If $b(x)$ is a gradient of the function $-B(x)$, then the limit points are the Kuhn-Tucker points for the problem of minimizing $B(\cdot)$ on $G$. Rate of convergence results for (1.1) are unavailable. For the unconstrained case the 'rate' results are of the following form. Let $X^\epsilon_n \to \theta$ in distribution as $\epsilon \to 0$, $\epsilon n \to \infty$. Define $U^\epsilon_n = (X^\epsilon_n - \theta)/\sqrt{\epsilon}$, and let $U^\epsilon(\cdot)$ denote the continuous parameter interpolation (interval $\epsilon$). Then, under the appropriate conditions, $U^\epsilon(t_\epsilon + \cdot)$ converges weakly to a stationary Gauss-Markov process as $\epsilon \to 0$ if $t_\epsilon \to \infty$ fast enough [12] (with a similar result for the stochastic approximation case). The result is based on a local linearization about $\theta$, and the rate result does not fully exploit the dynamics of the iteration. Such a linearization cannot, in any case, be done for (1.1) when the limit $\theta$ is on the boundary $\partial G$. Some results for this case are in [5], where $\{X^\epsilon_n\}$ is Markov, and (under appropriate conditions) $(X^\epsilon - \theta)/\epsilon$ is shown to converge in distribution as $\epsilon \to 0$, where $X^\epsilon$ is distributed according to a (unique) invariant measure for (1.1).

Here, we use the theory of large deviations to get a better picture of the asymptotic properties for (1.1). Let $D$ denote a neighborhood of $\theta$ (all neighborhoods are with respect to $G$), with $\theta$ a stable point of (1.2) and with $D$ in the domain of attraction of $\theta$. Let $\tau^\epsilon_D = \min\{t: x^\epsilon(t) \notin D\}$, define $C_x[0,T]$ to be the set of $G$-valued continuous functions on $[0,T]$ with initial condition $x$, and let $P_x$ denote the probability measure given that $x_0 = x$. We always use $d(\cdot,\cdot)$ to denote the (sup norm) distance between functions in $C_x[0,T]$, as well as the sup norm distance between points in a Euclidean space. As special cases of our large deviations results, we obtain estimates for quantities such as

(1.3) $\lim_\epsilon \epsilon \log P_x\{\tau^\epsilon_D \le T\}$

(1.4) $\lim_\epsilon \epsilon \log P_x\{x^\epsilon(\cdot) \in A\}$, $A \subset C_x[0,T]$,

$\lim_\epsilon \epsilon \log E_x \tau^\epsilon_D$.

The limits in (1.3, 1.4) are important in studying the asymptotic properties of $\{X^\epsilon_n\}$ and are often of greater interest than 'local' results of the type of limits of suitably normalized $(X^\epsilon_n - \theta)$. We can obtain the (asymptotic) locations of the exit from $D$ and the most likely escape routes, all of which are important in applications. A comparison of (1.3) for different algorithms yields information on their relative stability. They exploit more of the structure of the algorithm than the 'local' limits do, and often provide realistic information, e.g., estimates of the time spent in a neighborhood of a stable point, etc.

The paper is organized as follows. In Section 2, various terms from the theory of large deviations are introduced, and the problem (on a time interval $[0,T]$) formulated. Sections 3 and 4 contain some technical results concerning the action functional and approximations of (1.1). These are put together in Section 5 to get the general large deviations result. Section 6 concerns the mean escape time of (1.1) from a neighborhood of a stable point of (1.2), and in Section 7 we remark on some extensions to the global behavior of (1.1) on the infinite time interval, the character of movement from stable point to stable point, and on the stochastic approximation case.




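Let $T/\Delta$ be an integer and define $I_i = \{j: i\Delta/\epsilon \le j < (i\Delta+\Delta)/\epsilon\}$. Suppose that the limit (defining $b(\cdot)$) in (2.1) exists uniformly for $x \in G$:

(2.1) $\lim_N \frac{1}{N} \sum_j E\, b(x,\xi_j) = b(x)$.

Suppose that there is a function $H(\cdot,\cdot)$ such that for each $\Delta > 0$, the limit in (2.2) exists uniformly for $\{\alpha_i, x_i\}$ in any compact set.

(2.2) $\sum_{i=0}^{T/\Delta-1} \Delta\, H(\alpha_i, x_i) = \lim_N \frac{\Delta}{N} \log E \exp \Big[ \sum_{i=0}^{T/\Delta-1} \alpha_i' \sum_{j=iN}^{iN+N-1} b(x_i,\xi_j) \Big]$.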


$H(\cdot,\cdot)$ is obviously continuous, and we suppose that $H(\cdot,x)$ is continuously differentiable. The limits exist and we have the differentiability in $\alpha$ if $\{\xi_j\}$ is a finite state ergodic Markov chain (see [7], where the argument is based on one in [6]) or if $\xi_n = \sum_k g_{n-k}\psi_k$ and $\{\psi_k\}$ is i.i.d., bounded, $\sum_k |g_k| < \infty$ and $g_k = 0$ for $k < 0$ [8]. Define the usual Legendre transform $L$ and action functional $S$ by

$L(\beta,x) = \sup_\alpha [\beta'\alpha - H(\alpha,x)]$,

$S(T,\phi) = \int_0^T L(\dot\phi(s), \phi(s))\,ds$ for $\phi(\cdot)$ absolutely continuous in $C_x[0,T]$,

$S(T,\phi) = \infty$ otherwise.

$L(\cdot,\cdot)$ is lower semicontinuous (l.s.c.), $L(\cdot,x)$ is convex, and $S(T,\cdot)$ is l.s.c. [6]. The sets $U(x) = \{\beta: L(\beta,x) < \infty\}$ and $U_0(x) = U(x) - b(x)$ are convex and are uniformly bounded since $b(x,\xi)$ is uniformly bounded [7]. Assume that $U(\cdot)$ is continuous in the Hausdorff topology.
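As a concrete illustration of the Legendre transform, consider a scalar toy model (an assumption for this sketch, not the paper's setting): $b(x,\xi) = b_0 + \xi$ with $\xi = \pm 1$ i.i.d. and equally likely, so that $H(\alpha) = \alpha b_0 + \log\cosh\alpha$. The sketch below compares a brute-force grid evaluation of the sup with the closed form for this particular noise. The function names are hypothetical.

```python
import math

def H(alpha, b0):
    """H(alpha) = log E exp(alpha * (b0 + xi)) for xi = +1 or -1 with probability 1/2."""
    a = abs(alpha)
    # log cosh(alpha) written as |alpha| + log(1 + exp(-2|alpha|)) - log 2 to avoid overflow
    return alpha * b0 + a + math.log1p(math.exp(-2.0 * a)) - math.log(2.0)

def L_numeric(beta, b0, alpha_max=40.0, n=8001):
    """Grid approximation of L(beta) = sup_alpha [beta * alpha - H(alpha)]."""
    best = -math.inf
    for i in range(n):
        alpha = -alpha_max + 2.0 * alpha_max * i / (n - 1)
        best = max(best, beta * alpha - H(alpha, b0))
    return best

def L_exact(beta, b0):
    """Closed form for the +/-1 noise; infinite outside U = [b0 - 1, b0 + 1]."""
    u = beta - b0
    if abs(u) > 1.0:
        return math.inf

    def xlogx(t):
        return 0.0 if t == 0.0 else t * math.log(t)

    return 0.5 * (xlogx(1.0 + u) + xlogx(1.0 - u))
```

In this example the set $U = \{\beta: L(\beta) < \infty\}$ is the bounded interval $[b_0 - 1, b_0 + 1]$, and $L$ vanishes only at $\beta = b_0$, in line with the boundedness and convexity properties quoted above.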

Define $B(x,\beta) = \{v: \Pi_G(x,v) = \Pi_G(x,\beta)\}$, the set of 'velocities' having the same projection at $x$ as $\beta$ has. $B(\cdot,\cdot)$ is upper semicontinuous (u.s.c.) in the Hausdorff topology. Define

$L_G(\beta,x) = \inf_{v \in B(x,\beta)} L(v,x)$.

By the l.s.c. of $L(\cdot,\cdot)$ and u.s.c. of $B(\cdot,\cdot)$, $L_G(\cdot,\cdot)$ is l.s.c. For $\phi(\cdot)$ absolutely continuous, set

$S_G(T,\phi) = \int_0^T L_G(\dot\phi(s), \phi(s))\,ds$

and define $S_G(T,\phi) = \infty$ otherwise.
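The effect of taking the infimum over velocities with the same projection can be seen in a one-dimensional sketch. The assumptions, all made only for illustration: $G = [0,1]$, the projected velocity at an endpoint simply discards the outward component, and $L$ is the rate function of the toy model $b(x,\xi) = b_0 + \xi$ with $\xi = \pm 1$ equally likely. The helper names are hypothetical.

```python
import math

def L(v, b0):
    """Rate function for velocity v when b(x, xi) = b0 + xi, xi = +/-1 equally likely."""
    u = v - b0
    if abs(u) > 1.0:
        return math.inf
    xlogx = lambda t: 0.0 if t == 0.0 else t * math.log(t)
    return 0.5 * (xlogx(1.0 + u) + xlogx(1.0 - u))

def proj_velocity(x, v):
    """Pi_G(x, v) for G = [0, 1]: drop the outward component at the endpoints."""
    if x >= 1.0:
        return min(v, 0.0)
    if x <= 0.0:
        return max(v, 0.0)
    return v

def L_G(beta, x, b0):
    """inf of L(v) over the velocities v with Pi_G(x, v) = Pi_G(x, beta)."""
    p = proj_velocity(x, beta)
    if 0.0 < x < 1.0 or p != 0.0:
        return L(p, b0)          # only the single velocity p has this projection
    # At an endpoint with zero projected velocity the admissible set is a closed
    # half-line; L is convex with its minimum at b0, so clamp b0 into that half-line.
    v_star = max(b0, 0.0) if x >= 1.0 else min(b0, 0.0)
    return L(v_star, b0)
```

With $b_0 = 0.5$, standing still in the interior costs $L(0) > 0$, while standing still at the endpoint $x = 1$ costs nothing, because the outward drift is absorbed by the projection.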

Under some other conditions to be introduced below, the main result of the next three sections is that $S_G(T,\cdot)$ is an action functional for $x^\epsilon(\cdot)$ in that $S_G(T,\cdot)$ is l.s.c., and for $A \subset C_x[0,T]$ (with interior $A^0$ and closure $\bar A$),

(2.5) $-\inf_{\phi \in A^0} S_G(T,\phi) \le \underline{\lim}_\epsilon\, \epsilon \log P_x\{x^\epsilon(\cdot) \in A\} \le \overline{\lim}_\epsilon\, \epsilon \log P_x\{x^\epsilon(\cdot) \in A\} \le -\inf_{\phi \in \bar A} S_G(T,\phi)$.

Owing to the presence of the boundary, the analysis is somewhat non-standard. The typical results of interest require that the boundary be taken into account. See, e.g., Fig. 1, where $D$ is a set in the domain of attraction of $\theta$, a stable point of (1.2), and the arrows indicate the flow lines for (1.2). With the indicated 'typical' $U(x)$, the most likely escape paths from $D$ are along the boundary $[a,b]$.

To relate (2.5) to (1.3), let $A = \{\phi(\cdot): \phi(0) = x, \phi(t) \notin D$ for some $t \le T\}$.

Write the vectors $x$, $b$, $\beta$, etc. as $(x_1,x_2)$, $(b_1,b_2)$, $(\beta_1,\beta_2)$, etc., where $x_1, b_1$ have dimension $r_1$ and $x_2, b_2$ have dimension $r_2$ for some $r_1, r_2$. $\Sigma_{11}$ (resp., $\Sigma_{22}$) below is $r_1 \times r_1$ (resp., $r_2 \times r_2$). For the purposes of simplifying the analysis from Section 4 on, we make an additional assumption. Define $\Sigma(\cdot)$ and $\Sigma_N(\cdot)$ by

(2.6) $\lim_N \frac{1}{N}\, \mathrm{cov} \sum_j [b(x,\xi_j) - b(x)] \equiv \lim_N \Sigma_N(x) = \Sigma(x) = \begin{pmatrix} \Sigma_{11}(x) & \Sigma_{12}(x) \\ \Sigma_{21}(x) & \Sigma_{22}(x) \end{pmatrix}$.

We use either of the two cases. Case 1 (non-degenerate), where $\Sigma(x)$ is positive definite on $G$. Case 2 (degenerate), where $\Sigma_{11}(x) = \Sigma_{21}(x) = \Sigma_{12}(x) = 0$, and $\Sigma_{22}(x)$ is positive definite on $G$. These cover the typical cases in applications. In Case 2, $L(\beta,x) = \infty$ unless $\beta_1 = b_1(x)$. Define $U_2(x) = \{\beta_2: L((b_1(x),\beta_2),x) < \infty\}$ and define the $\delta$-interior sets $U^\delta(x) = \{\beta \in U(x): d(\beta,\partial U(x)) > \delta\}$, $U_2^\delta(x) = \{\beta_2 \in U_2(x): d(\beta_2,\partial U_2(x)) > \delta\}$.

Non-degenerate case. $H(\cdot,x)$ is strictly convex in a neighborhood of $\alpha = 0$, uniformly in $x$ in $G$, and for any $\delta > 0$ $L(\cdot,\cdot)$ is uniformly continuous for $(\beta,x) \in \{U^\delta(x), x \in G\}$. Also $L(\beta,x) = 0$ iff $\beta = b(x)$, and there is a neighborhood $N$ of the origin such that $N + b(x) \subset U(x)$ for all $x \in G$ [7], and $L(b(x) + \cdot, x)$ is strictly convex on $N$, uniformly in $x$ in $G$.

Degenerate case. Here the definition (2.2) reduces to

$H(\alpha,x) = \alpha_1' b_1(x) + H_2(\alpha_2,x)$,

where $H_2(\cdot,\cdot)$ is defined by (write $\alpha_i = (\alpha_{1,i}, \alpha_{2,i})$)

(2.7) $\sum_{i=0}^{T/\Delta-1} \Delta\, H_2(\alpha_{2,i}, x_i) = \lim_N \frac{\Delta}{N} \log E \exp \sum_{i=0}^{T/\Delta-1} \alpha_{2,i}' \sum_{j \in I_i} b_2(x_i,\xi_j)$.

Let $L_2(\beta_2,x)$ be the dual of $H_2(\alpha_2,x)$. Then $H_2(\cdot,\cdot)$ and $L_2(\cdot,\cdot)$ have the properties ascribed above to $H(\cdot,\cdot)$ and $L(\cdot,\cdot)$; also $L(\beta,x) = L_2(\beta_2,x)$ if $\beta_1 = b_1(x)$ and $L(\beta,x) = \infty$ otherwise. The following result will be used.

Lemma 1. Let $v_n \in U^\delta(x)$ and $v_n \to v$. Then $L(v_n,x) \to L(v,x)$.

For $\delta > 0$,

(2.8) $L(b(x) + (1-\delta)v, x) \le L(b(x) + v, x)$, $x \in G$, all $v$.

The first assertion is in Freidlin [6]. The second is a consequence of the convexity of $L(\cdot,x)$ and the fact that $L(b(x),x) = 0$, $L(b(x) + v,x) \ge 0$.
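The convexity argument behind (2.8) can be written out in one chain. This is a sketch that assumes $0 \le \delta \le 1$, so that $b(x) + (1-\delta)v$ is a convex combination of $b(x) + v$ and $b(x)$:

```latex
\begin{align*}
L\big(b(x) + (1-\delta)v,\, x\big)
  &= L\big((1-\delta)\,(b(x)+v) + \delta\, b(x),\, x\big) \\
  &\le (1-\delta)\, L\big(b(x)+v,\, x\big) + \delta\, L\big(b(x),\, x\big) \\
  &= (1-\delta)\, L\big(b(x)+v,\, x\big) \;\le\; L\big(b(x)+v,\, x\big).
\end{align*}
```

The second line uses convexity of $L(\cdot,x)$, the third uses $L(b(x),x) = 0$, and the final inequality uses $L \ge 0$.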
III. Discrete Approximations: Preliminary Formulation

Owing to the boundary $\partial G$, it is hard to get the large deviations results for (1.1) directly. We do it in a sequence of approximations, which get 'closer' to (1.1), and for each of which we can get a large deviations result from the preceding one. We define the approximations in this section. $\psi(\cdot)$ and $\phi(\cdot)$ denote arbitrary functions in $C_x[0,T]$. For $\Delta > 0$ (w.l.o.g. we let $T/\Delta$ and $\Delta/\epsilon$ be integers, with $N\Delta = T$) define $\psi_j^\Delta$ to equal $\psi(n\Delta)$ for $j \in I_n$ and define the sequence $\{Y_i^{\epsilon,\psi,\Delta}, i \le N\}$ by $Y_0^{\epsilon,\psi,\Delta} = x$ and

(3.1) $Y_{i+1}^{\epsilon,\psi,\Delta} = Y_i^{\epsilon,\psi,\Delta} + \epsilon \sum_{j \in I_i} b(\psi_j^\Delta, \xi_j)$.

Let $\psi^\Delta(\cdot)$ denote the piecewise constant (on intervals $[n\Delta, n\Delta + \Delta)$) interpolation of $\{\psi(i\Delta)\}$. We use $\psi^\Delta$ to represent the samples $\{\psi(i\Delta), 0 \le i \le N\}$.

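We first state a large deviations result for (3.1), then use the 'contraction' principle to get such a result for a projected form of (3.1), and then take appropriate limits as $\Delta \to 0$. In [6], Freidlin developed the large deviations theory for (3.1). The details in [6] were for a continuous parameter case, but the results and methods would be identical for the discrete parameter case (3.1). In particular, the following results hold. Define $S^{\psi,\Delta}$ by

(3.2) $S^{\psi,\Delta}(T,\phi) = \sum_{i=0}^{N-1} \Delta\, L\Big( \frac{\phi(i\Delta+\Delta) - \phi(i\Delta)}{\Delta},\ \psi(i\Delta) \Big)$.

Then $S^{\psi,\Delta}(T,\cdot)$ is an action functional for $\{Y_i^{\epsilon,\psi,\Delta}, i \le N\}$ in the sense that for any Borel set $B$ in $(R^r)^N$

(3.3) $-\inf_{\phi^\Delta \in B^0} S^{\psi,\Delta}(T,\phi) \le \underline{\lim}_\epsilon\, \epsilon \log P_x\{\{Y_i^{\epsilon,\psi,\Delta}, 0 \le i \le N\} \in B\} \le \overline{\lim}_\epsilon\, \epsilon \log P_x\{\{Y_i^{\epsilon,\psi,\Delta}, 0 \le i \le N\} \in B\} \le -\inf_{\phi^\Delta \in \bar B} S^{\psi,\Delta}(T,\phi)$.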


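For $\phi(\cdot) \in C_x[0,T]$, define

(3.4) $S_G^{\psi,\Delta}(T,\phi) = \inf_f S^{\psi,\Delta}(T,f)$,

where the inf is over

$\{f: \Pi_G(\phi(i\Delta) + f(i\Delta+\Delta) - f(i\Delta)) = \phi(i\Delta+\Delta),\ i \le N-1\}$.

For later use, it is more convenient to rewrite (3.4) in the form

(3.5) $S_G^{\psi,\Delta}(T,\phi) = \sum_{i=0}^{N-1} \Delta \inf_f L\Big( \frac{f(i\Delta+\Delta) - f(i\Delta)}{\Delta},\ \psi(i\Delta) \Big)$,

where the $\inf_f$ is over the same set as for (3.4). By the 'contraction principle' ([9], p. 5), $S_G^{\psi,\Delta}(T,\cdot)$ is the action functional for the 'projected' sequence $\{X_i^{\epsilon,\psi,\Delta}, i \le N\}$ defined by $X_0^{\epsilon,\psi,\Delta} = x$ and

(3.6) $X_{i+1}^{\epsilon,\psi,\Delta} = \Pi_G\big(X_i^{\epsilon,\psi,\Delta} + \epsilon \sum_{j \in I_i} b(\psi_j^\Delta, \xi_j)\big)$.

In the next section we prove (Theorem 1) that $S_G^{\phi,\Delta}(T,\phi) \equiv S_G^\Delta(T,\phi)$ is an action functional for the next approximation $\{X_i^{\epsilon,\Delta}, i \le N\}$ defined by $X_0^{\epsilon,\Delta} = x$ and

(3.7) $X_{i+1}^{\epsilon,\Delta} = \Pi_G\big(X_i^{\epsilon,\Delta} + \epsilon \sum_{j \in I_i} b(X_i^{\epsilon,\Delta}, \xi_j)\big)$.

Let $x^{\epsilon,\Delta}(\cdot)$ denote the piecewise constant interpolation of $\{X_i^{\epsilon,\Delta}\}$ (interpolation interval $\Delta$). For a set $A \subset C_x[0,T]$ it will sometimes be convenient to use the 'sampled' notation $x^{\epsilon,\Delta}(\cdot) \in A^\Delta$ to mean that $X_i^{\epsilon,\Delta} = \phi(i\Delta)$ for $i \le N$ for some $\phi(\cdot) \in A$.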

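In Theorem 2, the l.s.c. of $S_G(T,\cdot)$ is proved, as is the relation

(3.8) $-\inf_{\phi \in A^0} S_G(T,\phi) \le \underline{\lim}_\Delta\, \underline{\lim}_\epsilon\, \epsilon \log P_x\{x^{\epsilon,\Delta}(\cdot) \in A^\Delta\} \le \overline{\lim}_\Delta\, \overline{\lim}_\epsilon\, \epsilon \log P_x\{x^{\epsilon,\Delta}(\cdot) \in A^\Delta\} \le -\inf_{\phi \in \bar A} S_G(T,\phi)$.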


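Write $X^{\epsilon,\Delta} = \{X_i^{\epsilon,\Delta}, 0 \le i \le N\}$, $X^{\epsilon,\phi,\Delta} = \{X_i^{\epsilon,\phi,\Delta}, 0 \le i \le N\}$, $\phi^\Delta = \{\phi(i\Delta), 0 \le i \le N\}$.

Theorem 1. For each $\Delta > 0$, $S_G^\Delta(T,\cdot)$ is an action functional for $X^{\epsilon,\Delta}$: for any Borel set $B$ in $(R^r)^N$,

(4.1) $-\inf_{\phi^\Delta \in B^0} S_G^\Delta(T,\phi) \le \underline{\lim}_\epsilon\, \epsilon \log P_x\{X^{\epsilon,\Delta} \in B\} \le \overline{\lim}_\epsilon\, \epsilon \log P_x\{X^{\epsilon,\Delta} \in B\} \le -\inf_{\phi^\Delta \in \bar B} S_G^\Delta(T,\phi)$.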

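Proof. Given $\delta > 0$, there is a $\delta_1 > 0$ (where $\delta_1 \to 0$ as $\delta \to 0$) such that

$d(X^{\epsilon,\phi,\Delta}, \phi^\Delta) < \delta_1 \Rightarrow d(X^{\epsilon,\Delta}, \phi^\Delta) < \delta$.

To see this, write $\phi(i\Delta) = X_i^{\epsilon,\phi,\Delta} + \rho_i$ where $|\rho_i| < \delta_1$ and

(4.2a) $X_{i+1}^{\epsilon,\phi,\Delta} = \Pi_G\big(X_i^{\epsilon,\phi,\Delta} + \epsilon \sum_{j \in I_i} b(X_i^{\epsilon,\phi,\Delta} + \rho_i, \xi_j)\big)$,

(4.2b) $X_{i+1}^{\epsilon,\Delta} = \Pi_G\big(X_i^{\epsilon,\Delta} + \epsilon \sum_{j \in I_i} b(X_i^{\epsilon,\Delta}, \xi_j)\big)$.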
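The existence of a suitable $\delta_1$ follows from this and the relation $|\Pi_G(x) - \Pi_G(x')| \le |x - x'|$ (due to convexity of $G$) and the uniform Lipschitz condition on $b(\cdot,\xi)$.

Similarly there is a $\delta_2 > 0$ (where $\delta_2 \to 0$ as $\delta \to 0$) such that

$d(X^{\epsilon,\Delta}, \phi^\Delta) < \delta \Rightarrow d(X^{\epsilon,\phi,\Delta}, \phi^\Delta) < \delta_2$.

Thus

(4.3) $P_x\{d(X^{\epsilon,\Delta}, \phi^\Delta) < \delta\} \ge P_x\{d(X^{\epsilon,\phi,\Delta}, \phi^\Delta) < \delta_1\}$,

(4.4) $P_x\{d(X^{\epsilon,\phi,\Delta}, \phi^\Delta) < \delta_2\} \ge P_x\{d(X^{\epsilon,\Delta}, \phi^\Delta) < \delta\}$.

The result follows from (4.3) - (4.4) in the standard way [6]. In particular, given $B$ (with non-empty $B^0$) and $h > 0$, there are $\phi_1^\Delta \in B^0$ and small $\delta, \delta_1$ such that

$P_x\{X^{\epsilon,\Delta} \in B\} \ge P_x\{X^{\epsilon,\Delta} \in B^0\} \ge P_x\{d(X^{\epsilon,\Delta}, \phi_1^\Delta) < \delta\} \ge P_x\{d(X^{\epsilon,\phi_1,\Delta}, \phi_1^\Delta) < \delta_1\} \ge \exp -\frac{1}{\epsilon}[S_G^\Delta(T,\phi_1) + h]$

for small $\epsilon$. The left side of (4.1) follows from an appropriate choice of $\phi_1$ (a 'tail' element of the infimizing sequence). The right hand inequality of (4.1) follows from a similar approximation argument. Q.E.D.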

Theorem 2. $S_G(T,\cdot)$ is l.s.c. For each $A \subset C_x[0,T]$,

(4.5) $\underline{\lim}_\Delta \inf_{\phi \in \bar A^\Delta} S_G^\Delta(T,\phi) \ge \inf_{\phi \in \bar A} S_G(T,\phi)$.

For each $\phi(\cdot)$ for which $S_G(T,\phi) < \infty$, there are piecewise constant (on intervals of length $\Delta$) functions $\phi^\Delta(\cdot)$, $\psi^\Delta(\cdot)$ converging uniformly to $\phi(\cdot)$ such that

(4.6) $\overline{\lim}_\Delta S_G^{\psi^\Delta,\Delta}(T,\phi^\Delta) \le S_G(T,\phi)$.

The inequalities (3.8) hold.

Proof. Part 1. The proofs for the degenerate and non-degenerate cases are essentially the same and we do the latter case only (for the degenerate case, use $U_2^\delta$ in lieu of $U^\delta$ below).

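Proof of the l.s.c. of $S_G(T,\cdot)$. Let $\phi_n(\cdot) \to \phi(\cdot)$. The infimizing $v$ is attained in

$\inf_{v \in B(\phi_n(s), \dot\phi_n(s))} L(v, \phi_n(s))$.

Let $\bar v_n(\cdot)$ be a measurable selection* [11, Thm. 4.1] of the minimizer and write it as $\bar v_n(s) = v_n(s) + b(\phi_n(s))$. Using the uniform continuity of $L(\cdot,\cdot)$ on $\{\beta, x: \beta \in U^\delta(x), x \in G\}$ for $\delta > 0$ we have

(4.7)
$\underline{\lim}_n S_G(T,\phi_n) = \underline{\lim}_n \int_0^T L[b(\phi_n(s)) + v_n(s), \phi_n(s)]\,ds$

$\ge \underline{\lim}_n \int_0^T L[b(\phi_n(s)) + (1-\delta)v_n(s), \phi_n(s)]\,ds$

$= \underline{\lim}_n \int_0^T L[b(\phi(s)) + (1-\delta)v_n(s), \phi(s)]\,ds$

$\ge \lim_\Delta \underline{\lim}_n \Big\{ \sum_{i=0}^{N-1} \int_{i\Delta}^{i\Delta+\Delta} L[b(\phi(s)) + (1-\delta)v_n(s), \phi(i\Delta)]\,ds - \epsilon_\delta^\Delta \Big\}$

$\ge \lim_\delta \lim_\Delta \underline{\lim}_n \sum_{i=0}^{N-1} \Delta\, L\Big[ \frac{1}{\Delta} \int_{i\Delta}^{i\Delta+\Delta} [b(\phi(s)) + (1-\delta)v_n(s)]\,ds,\ \phi(i\Delta) \Big]$,

where $\epsilon_\delta^\Delta \to 0$ as $\Delta \to 0$ for each $\delta > 0$. The first inequality uses Lemma 1, and the last inequality follows from Jensen's inequality and the convexity of $L(\cdot,x)$.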


*The selection theorem 4.1 in [11] uses a bounded u.s.c. function and a maximization. But a slight modification works for our case, since the l.s.c. $L(\cdot,\cdot)$ is bounded from below.

Choose a subsequence such that $\int_0^t v_n(s)\,ds$ converges, with limit denoted by (absolutely continuous since the $U(x)$ are bounded) $V(\cdot)$, and write $V(t) = \int_0^t v(s)\,ds$. By the l.s.c. of $L(\cdot,\cdot)$ and Fatou's lemma, we can continue the string of inequalities in (4.7) as

(4.7') $\ge \lim_\delta \int_0^T L(b(\phi(s)) + (1-\delta)v(s), \phi(s))\,ds \ge \int_0^T L(b(\phi(s)) + v(s), \phi(s))\,ds$.

If $\Pi_G(\phi(s), b(\phi(s)) + v(s)) = \dot\phi(s)$ for almost all $s \le T$ we are done, since in that case (for almost all $s$) $b(\phi(s)) + v(s) \in B(\phi(s), \dot\phi(s))$ and

(4.8) $L(b(\phi(s)) + v(s), \phi(s)) \ge \inf_{v \in B(\phi(s), \dot\phi(s))} L(v, \phi(s)) = L_G(\dot\phi(s), \phi(s))$.

Thus, we need only show the projection property below (4.7') for $b(\phi(s)) + v(s) \equiv \bar v(s)$.

If for some $s \le T$, $\phi(s) \in G^0$, the interior of $G$, then $\bar v_n(t) = \dot\phi_n(t)$ for almost all $t$ and large $n$, on some open interval containing $s$. This implies that $\bar v(s) = \dot\phi(s)$ for almost all $s$ such that $\phi(s) \in G^0$. Now, let $\phi(s) \in \partial G$ on some interval. In particular let $I = [a,b]$, $a < b$, be such that (rearrange the indices if necessary) for some $\delta > 0$, an integer $\ell$ and all $s \in I$, $q_i(\phi(s)) = 0$, $i \le \ell$, $q_i(\phi(s)) \le -\delta < 0$, $i > \ell$. Define the set $G(\phi) = \{y: q_i(y) \le 0, i \le \ell\}$. Let $C(x)$ denote the cone generated by the outer normals to $\{y: q_i(y) \le 0\}$, $i \le \ell$, at the point $x$. Then $C(\phi_n(s)) \to C(\phi(s))$ for $s \in I$. Define the 'projection error' $\tilde v_n(s)$ by

$\tilde v_n(s) = \bar v_n(s) - \Pi_{G(\phi)}(\phi_n(s), \bar v_n(s)) = \bar v_n(s) - \dot\phi_n(s)$.

Then $\tilde v_n(s) \in C(\phi_n(s))$, if $\phi_n(s) \in \partial G(\phi)$. Otherwise $\tilde v_n(s) = 0$. Extracting a convergent subsequence if necessary, there is an absolutely continuous function $\tilde V(\cdot)$ such that

$\int_a^t \tilde v_n(s)\,ds \to \tilde V(t) \equiv \int_a^t \tilde v(s)\,ds$, $t \le b$.

Note that $\phi(\cdot)$ moves orthogonally to $C(\phi(s))$ at $s$ (recall that the active constraints for $\phi(s)$ do not change for $s \in I$). Since $\phi_n(\cdot) \to \phi(\cdot)$ and $\phi(s) \in \partial G(\phi)$ on $I$,

$\phi_n(a) + \int_a^t [\bar v_n(s) - \tilde v_n(s)]\,ds \to \phi(a) + \int_a^t [\bar v(s) - \tilde v(s)]\,ds = \phi(t)$, $t \le b$.

Thus $\bar v(s) - \tilde v(s) = \dot\phi(s)$ for a.a. $s \in I$. By construction, $\tilde v(s) \perp \bar v(s) - \tilde v(s)$ and $\tilde v(s) \in C(\phi(s))$ for almost all $s \in I$. Thus $\Pi_G(\phi(s), \bar v(s)) = \Pi_G(\phi(s), \bar v(s) - \tilde v(s) + \tilde v(s)) = \Pi_G(\phi(s), \bar v(s) - \tilde v(s)) = \Pi_G(\phi(s), \dot\phi(s))$, a.a. $s \in I$. By this method, we can show that $\dot\phi(s) = \Pi_G(\phi(s), \bar v(s))$ for a.a. $s \in [0,T]$ and the 'projection' requirement below (4.7') holds. Thus $S_G(T,\cdot)$ is l.s.c.


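Part 2. We can write

$S_G^\Delta(T,\phi) = \int_0^T \inf_u L(u, \phi^\Delta(s))\,ds$,

where on the interval $[i\Delta, i\Delta + \Delta)$, $\inf_u$ is the inf over all $u$ such that

$\Pi_G(\phi(i\Delta) + \Delta u) - \phi(i\Delta) = \phi(i\Delta + \Delta) - \phi(i\Delta)$.

By using this and an argument very similar to that used to get the l.s.c. property in Part 1, we can show that

$\underline{\lim}_\Delta S_G^\Delta(T,\phi) \ge S_G(T,\phi)$.

Also, if $\phi^\Delta(\cdot) \to \phi(\cdot)$ ($\phi^\Delta(\cdot)$ being piecewise constant), a similar argument yields

(4.10) $\underline{\lim}_\Delta S_G^\Delta(T,\phi^\Delta) \ge S_G(T,\phi)$.

We now prove (4.5). Let $\inf_{\phi \in \bar A} S_G(T,\phi) < \infty$ and let $\phi_0(\cdot)$ attain the inf. Let (piecewise constant) $\phi^\Delta(\cdot)$ yield the inf in $\inf_{\phi \in \bar A^\Delta} S_G^\Delta(T,\phi)$. Since $\{(\phi^\Delta(i\Delta + \Delta) - \phi^\Delta(i\Delta))/\Delta,\ i\Delta \le T\}$ is bounded, we can extract a convergent subsequence of the piecewise constant functions $\{\phi^\Delta(\cdot)\}$ with an absolutely continuous limit $\phi(\cdot)$. Then by (4.10)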
$\underline{\lim}_\Delta S_G^\Delta(T,\phi^\Delta) \ge S_G(T,\phi) \ge S_G(T,\phi_0)$,

which implies that (4.5) holds, together with the right side of (3.8). If $\inf_{\phi \in \bar A} S_G(T,\phi) = \infty$, then the above argument yields that $\underline{\lim}_\Delta S_G^\Delta(T,\phi^\Delta) = \infty$ also.

Inequality (4.6) yields the left side of (3.8), by use of the following observations. (See Part 1 for a related argument.) If $A^0$ is not empty, then there is a 'nearly infimizing' $\phi(\cdot) \in A^0$ and a $\delta > 0$ such that for small $\epsilon, \Delta$,

(4.11a) $P_x\{x^{\epsilon,\Delta}(\cdot) \in A^\Delta\} \ge P_x\{x^{\epsilon,\Delta}(\cdot) \in (A^\Delta)^0\} \ge P_x\{d(x^{\epsilon,\Delta}(\cdot), \phi(\cdot)) < \delta\}$.

Given $h > 0$ there is a $\delta_1 > 0$ and $\epsilon_0 > 0$ such that for $\epsilon \le \epsilon_0$ and small $\Delta$ and $\psi^\Delta(\cdot), \phi^\Delta(\cdot)$ close enough to $\phi(\cdot)$, we can continue the inequalities as (using (4.6))

(4.11b) $\ge P_x\{d(x^{\epsilon,\psi^\Delta,\Delta}(\cdot), \phi^\Delta(\cdot)) < \delta_1\} \ge \exp -\frac{1}{\epsilon}[S_G^{\psi^\Delta,\Delta}(T,\phi^\Delta) + h] \ge \exp -\frac{1}{\epsilon}[S_G(T,\phi) + 2h]$.

Thus, to get the l.h.s. of (3.8) only (4.6) needs to be proved.

To do this, we adapt an argument of Freidlin [6].


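Part 3. We can write

$S_G(T,\phi) = \int_0^T \inf_{v \in B(\phi(s), \dot\phi(s))} \sup_\alpha [\alpha' v - H(\alpha, \phi(s))]\,ds$.

The inf is realized; let $v(\cdot)$ be a measurable selection of the minimizer and define $V(t) = x + \int_0^t v(s)\,ds$. We have

(4.12) $S_G(T,\phi) \ge \sum_{i=0}^{N-1} \sup_\alpha \int_{i\Delta}^{i\Delta+\Delta} [\alpha' v(s) - H(\alpha, \phi(s))]\,ds$.

The sups in (4.12) are attained at some $\{\alpha_i^\Delta, 0 \le i < N\}$. There are $s_i^\Delta \in [i\Delta, i\Delta + \Delta)$ such that

$\frac{1}{\Delta} \int_{i\Delta}^{i\Delta+\Delta} H(\alpha_i^\Delta, \phi(s))\,ds = H(\alpha_i^\Delta, \phi(s_i^\Delta))$.

Define $\psi^\Delta(\cdot)$ to be the function with value $\phi(s_i^\Delta)$ on $[i\Delta, i\Delta + \Delta)$. As $\Delta \to 0$, $\psi^\Delta(\cdot) \to \phi(\cdot)$ uniformly on $[0,T]$. Thus

(4.13) $S_G(T,\phi) \ge \sum_{i=0}^{N-1} \Delta \Big[ \alpha_i^{\Delta\prime}\, \frac{V(i\Delta+\Delta) - V(i\Delta)}{\Delta} - H(\alpha_i^\Delta, \psi^\Delta(i\Delta)) \Big] = \sum_{i=0}^{N-1} \Delta\, L\Big[ \frac{V(i\Delta+\Delta) - V(i\Delta)}{\Delta},\ \psi^\Delta(i\Delta) \Big]$.

Define $\phi^\Delta(\cdot)$ to be the piecewise linear function with samples $\phi^\Delta(0) = x$, $\phi^\Delta(i\Delta+\Delta) = \Pi_G(\phi^\Delta(i\Delta) + V(i\Delta+\Delta) - V(i\Delta))$. Then by (4.13)

(4.14) $S_G(T,\phi) \ge \sum_{i=0}^{N-1} \Delta \inf_{v:\ \Pi_G(\phi^\Delta(i\Delta) + \Delta v) = \phi^\Delta(i\Delta+\Delta)} L(v, \psi^\Delta(i\Delta))$.

A proof similar to that of Theorem 3 below yields that $\phi^\Delta(\cdot) \to \phi(\cdot)$ uniformly on $[0,T]$. Thus (4.6) is proved. Q.E.D.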


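In order to extend Theorem 1 (for the $x^{\epsilon,\Delta}(\cdot)$ process) to the $x^\epsilon(\cdot)$ process, we need to show that the processes are close for small $\epsilon, \Delta$. Let $\Delta = k\epsilon$, $k$ being a large integer. Recall the definitions

(5.1) $X_{i+1}^{\epsilon,\Delta} = \Pi_G\big(X_i^{\epsilon,\Delta} + \epsilon \sum_{j \in I_i} b(X_i^{\epsilon,\Delta}, \xi_j)\big)$, $I_i = \{ik \le j < ik + k\}$, $i \le T/\Delta$,

(5.2) $X_{j+1}^\epsilon = \Pi_G(X_j^\epsilon + \epsilon b(X_j^\epsilon, \xi_j))$, $j \le T/\epsilon = kT/\Delta$.

To extend the large deviations results to $x^\epsilon(\cdot)$ it is sufficient (Theorem 4 below) to show that for each $T < \infty$, $\phi(\cdot)$ and $\delta > 0$, there are $\delta_1, \delta_2 > 0$ which tend to zero as $\delta \to 0$ such that for small enough $\Delta$

(5.3) $d(x^{\epsilon,\Delta}(\cdot), \phi(\cdot)) < \delta_1 \Rightarrow d(x^\epsilon(\cdot), \phi(\cdot)) < \delta \Rightarrow d(x^{\epsilon,\Delta}(\cdot), \phi(\cdot)) < \delta_2$.

Let $p_j = b(\phi(j\epsilon), \xi_j)$ and define the processes $\{\bar X_n^\epsilon, n \le T/\epsilon\}$, $\{\bar X_{ik}^{\epsilon,\Delta}, ik \le T/\epsilon\}$ by

$\bar X_{n+1}^\epsilon = \Pi_G(\bar X_n^\epsilon + \epsilon p_n)$, $\bar X_{ik+k}^{\epsilon,\Delta} = \Pi_G\big(\bar X_{ik}^{\epsilon,\Delta} + \epsilon \sum_{j \in I_i} p_j\big)$,

and their continuous parameter interpolations $\bar x^\epsilon(\cdot)$ (interval $\epsilon$) and $\bar x^{\epsilon,\Delta}(\cdot)$ (interval $\Delta$).

By the convexity of $G$ and the Lipschitz condition on $b(\cdot,\xi)$, given $\delta > 0$ (resp., $\delta' > 0$) there are $\delta_i$ (resp., $\delta_i'$) going to zero as $\delta$ (resp., $\delta'$) goes to zero and such that

$d(\bar x^{\epsilon,\Delta}(\cdot), \phi) < \delta_1 \Rightarrow d(x^{\epsilon,\Delta}(\cdot), \phi) < \delta \Rightarrow d(\bar x^{\epsilon,\Delta}(\cdot), \phi) < \delta_2$,

$d(\bar x^\epsilon(\cdot), \phi) < \delta_1' \Rightarrow d(x^\epsilon(\cdot), \phi) < \delta' \Rightarrow d(\bar x^\epsilon(\cdot), \phi) < \delta_2'$.

Thus, to show (5.3), we need only show that $d(\bar x^{\epsilon,\Delta}(\cdot), \bar x^\epsilon(\cdot)) \to 0$ as $\Delta \to 0$, $\Delta/\epsilon \to \infty$. We will actually bound $|\bar X_{ik}^\epsilon - \bar X_{ik}^{\epsilon,\Delta}|$, $\epsilon ik \le T$.

For notational convenience, let $|p_j| \le 1$ and absorb any other bound into the $\epsilon$.

The basic idea is to show that if the two processes ever separate by $\Delta^{1/2}$, then the maximum rate of growth of the separation is then slow enough for them to stay close. The following lemma will be used in the proof.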

Lemma 2. Let $x_1, x_2$ be in $G$, with $v = x_2 - x_1$. Fix $\gamma > 0$. Let $y_1 \in N_\gamma(x_1) \cap G$, $y_2 \in N_\gamma(x_2)$, and $w = y_1 - y_2$. Then

(5.4) $\langle \Pi_G(y_2) - y_2,\ v/|v| \rangle \le \gamma |w| / |v|$.

Proof. (See Fig. 2.) If $y_2 \in G$, there is nothing to prove. Let $y_2 \notin G$. Consider the hyperplane defined by the normal $\Pi_G(y_2) - y_2$ and point $y_2$. Since $G$ is convex, $x_1$ lies on the same side of this hyperplane as does $\Pi_G(y_2)$. Thus

$\langle \Pi_G(y_2) - y_2,\ x_1 - y_2 \rangle \ge 0$ or

$\langle \Pi_G(y_2) - y_2,\ (x_1 - x_2) + (x_2 - y_2) \rangle \ge 0$ which implies

$\langle \Pi_G(y_2) - y_2,\ x_1 - x_2 \rangle \ge -\langle \Pi_G(y_2) - y_2,\ x_2 - y_2 \rangle$.

Since $|\Pi_G(y_2) - y_2| \le |w|$ and $|x_2 - y_2| \le \gamma$, the lemma follows. Q.E.D.
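The inequality (5.4) can be spot-checked numerically. The sketch below makes several assumptions that are not from the paper: $G$ is the unit square (so the projection is coordinatewise clipping), neighborhoods are Euclidean balls, and the direction is read as $v = x_2 - x_1$ with $w = y_1 - y_2$, which is the reading consistent with the proof above. It is a sanity check on random configurations, not a proof.

```python
import math
import random

def project_box(y, lo=0.0, hi=1.0):
    """Nearest-point projection onto the box G = [lo, hi]^2 (illustrative choice of G)."""
    return [min(max(c, lo), hi) for c in y]

def sub(a, b):
    return [p - q for p, q in zip(a, b)]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def lemma2_gap(x1, x2, y1, y2, gamma):
    """gamma*|w|/|v| - <Pi_G(y2) - y2, v/|v|>, with v = x2 - x1 and w = y1 - y2."""
    v = sub(x2, x1)
    w = sub(y1, y2)
    correction = sub(project_box(y2), y2)
    return gamma * norm(w) / norm(v) - dot(correction, v) / norm(v)

def random_check(trials=2000, gamma=0.05, seed=0):
    """Smallest gap over random configurations; (5.4) predicts a nonnegative value."""
    rng = random.Random(seed)

    def nudge(x):
        # random displacement of Euclidean length at most gamma
        r, ang = gamma * rng.random(), 2.0 * math.pi * rng.random()
        return [x[0] + r * math.cos(ang), x[1] + r * math.sin(ang)]

    worst = math.inf
    for _ in range(trials):
        x1 = [rng.random(), rng.random()]
        x2 = [rng.random(), rng.random()]
        if norm(sub(x2, x1)) < 1e-6:
            continue
        y1 = project_box(nudge(x1))   # stays in G and within gamma of x1
        y2 = nudge(x2)                # within gamma of x2, possibly outside G
        worst = min(worst, lemma2_gap(x1, x2, y1, y2, gamma))
    return worst
```

Calling `random_check()` returns the smallest observed slack in (5.4). A clearly negative value would indicate a misreading of the lemma's sign conventions.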


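Theorem 3.

$\lim_\Delta \lim_\epsilon \sup_{ik \le T/\epsilon} |\bar X_{ik}^\epsilon - \bar X_{ik}^{\epsilon,\Delta}| = 0$.

Proof. We use Lemma 2, where we identify $\bar X_{ik}^{\epsilon,\Delta}$ with $x_1$ and $\bar X_{ik}^\epsilon$ with $x_2$. Let $n \in [ik, ik + k)$ and set $y_1 = \bar X_n^\epsilon$, $y_2 = \bar X_n^\epsilon + \epsilon p_n$. Thus $\Pi_G(y_2) = \Pi_G(\bar X_n^\epsilon + \epsilon p_n)$. Since $|p_n| \le 1$ and $\Pi_G(\cdot)$ is a contraction and $k = \Delta/\epsilon$, we can use the value $\Delta$ for $\gamma$. Define $d_k = |\bar X_{ik}^\epsilon - \bar X_{ik}^{\epsilon,\Delta}|$. Then the lemma yields

$\langle \Pi_G(\bar X_n^\epsilon + \epsilon p_n) - (\bar X_n^\epsilon + \epsilon p_n),\ (\bar X_{ik}^\epsilon - \bar X_{ik}^{\epsilon,\Delta})/d_k \rangle \le \epsilon\Delta/d_k$

or, equivalently,

$\langle \bar X_{n+1}^\epsilon - \bar X_n^\epsilon - \epsilon p_n,\ (\bar X_{ik}^\epsilon - \bar X_{ik}^{\epsilon,\Delta})/d_k \rangle \le \epsilon\Delta/d_k$.

Summing from $ik$ to $ik + k - 1$ yields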


(5.6) 
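Subtracting (5.6) from (5.5) and defining $Y_{ik}^\epsilon = \bar X_{ik}^\epsilon - \bar X_{ik}^{\epsilon,\Delta}$ yields

(5.7) $\langle Y_{ik+k}^\epsilon - Y_{ik}^\epsilon,\ Y_{ik}^\epsilon/d_k \rangle \le 2\Delta^2/d_k$.

Suppose that $d_k \ge \Delta^{1/2}$. Then (5.7) implies that the component of $Y_{ik+k}^\epsilon$ in the direction $Y_{ik}^\epsilon$ has magnitude less than or equal to $2\Delta^2/d_k + d_k \le 2\Delta^{3/2} + d_k$. The bound $|p_j| \le 1$ implies that the projection of $Y_{ik+k}^\epsilon$ onto the hyperplane normal to $Y_{ik}^\epsilon$ has magnitude $\le 2\Delta$. (In fact $|Y_{ik+k}^\epsilon - Y_{ik}^\epsilon| \le 2\Delta$.) Thus, if $d_k \ge \Delta^{1/2}$,

$d_{k+1}^2 \le (2\Delta^{3/2} + d_k)^2 + 4\Delta^2$.

Let $k_1$ denote the maximum distance across $G$. Then (for $d_k \ge \Delta^{1/2}$ and $2\Delta^{3/2} \le 1$)

$d_{k+1}^2 \le (2 + 4k_1)\Delta^{3/2} + 4\Delta^2 + d_k^2$

or, in general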


d|  «  max  [a1/*.  (6  +  4kj)^  .  A*^*] 


Q.E.D. 


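It follows from Theorems 2 and 3 that

Theorem 4. $S_G(T,\cdot)$ is an action functional for $\{x^\epsilon(\cdot)\}$ and (2.5) holds.

Proof. Fix the set $A$, and let $\phi \in A^0$. Using (5.3) select $\delta > 0$, $\delta_1 > 0$ such that $N_\delta(\phi) \subset A^0$ and for small $\Delta$

$\{d(x^{\epsilon,\Delta}(\cdot), \phi(\cdot)) < \delta_1\} \subset \{d(x^\epsilon(\cdot), \phi(\cdot)) < \delta\}$.

It then follows that

$\underline{\lim}_\epsilon\, \epsilon \log P_x\{x^\epsilon(\cdot) \in A\} \ge \underline{\lim}_\epsilon\, \epsilon \log P_x\{d(x^\epsilon(\cdot), \phi(\cdot)) < \delta\} \ge \underline{\lim}_\Delta\, \underline{\lim}_\epsilon\, \epsilon \log P_x\{d(x^{\epsilon,\Delta}(\cdot), \phi(\cdot)) < \delta_1\} \ge -S_G(T,\phi)$

where the last inequality is due to (3.8). This gives the left hand side of (2.5).

Since the estimates in Theorem 3 are independent of the particular $\phi$ chosen we have

$P_x\{x^\epsilon(\cdot) \in A\} \le P_x\{x^{\epsilon,\Delta}(\cdot) \in N_\delta(A)\}$

for any $\delta > 0$ and small enough $\epsilon, \Delta$. Hence by (3.8)

$\overline{\lim}_\epsilon\, \epsilon \log P_x\{x^\epsilon(\cdot) \in A\} \le -\lim_\delta \inf_{\phi \in \overline{N_\delta(A)}} S_G(T,\phi)$.

Since by l.s.c.

$\lim_\delta \inf_{\phi \in \overline{N_\delta(A)}} S_G(T,\phi) \ge \inf_{\phi \in \bar A} S_G(T,\phi)$,

the right hand side of (2.5) is proved. Q.E.D.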


Let $\theta$ be an asymptotically stable point of (6.1), and $D$ a neighborhood (relative to $G$) of $\theta$ with $D$ in the domain of attraction of $\theta$.

(6.1) $\dot x = \Pi_G(x, b(x))$

Let $\tau_D^\epsilon$ denote the escape time of $x^\epsilon(\cdot)$ from $D$. Then, under some additional assumptions, we will prove the analog of the classical case [6], [10], namely

(6.2) $\lim_\epsilon \epsilon \log E_x \tau_D^\epsilon = S_D(\theta)$, $x \in D$,

where

$S_D(\theta) = \inf\{S_G(T,\phi): \phi(0) = \theta,\ \phi(T) \in \partial D\}$.

All neighborhoods are relative to $G$.

In order to avoid excess detail, we work with the non-degenerate case (see below (2.6)), but the results hold for the degenerate case as well, if we assume the existence of the $\phi^x(\cdot)$ discussed below (6.3).

Since

$L_G(\beta,x) = \inf_{u:\ \Pi_G(x, b(x) + u) = \beta} L(b(x) + u, x)$

and $L(\alpha,x) = 0$ if and only if $\alpha = b(x)$, we have

(6.3) $L_G(\beta,x) = 0$ iff $\beta = \Pi_G(x, b(x))$.

Loosely speaking, we 'pay' only when noise or 'control' $u$ is required to force a deviation from the (free) path of (6.1).

For each small $\rho_1 > 0$ and $N_{\rho_1}(\theta)$ there is a $\rho_2$ ($\rho_2 \to 0$ as $\rho_1 \to 0$) and a $T_0$ such that all paths of (6.1) starting in $D$ reach $N_{\rho_1}(\theta)$ by time $T_0$ and do not leave $N_{\rho_2}(\theta)$ after first hitting $\partial N_{\rho_1}(\theta)$. By the non-degeneracy assumption $L(b(x) + u, x)$ is strictly convex in $u$ and equals $o(u)$ uniformly in $x \in G$. This implies the following. For each $\rho > 0$ there are $T < \infty$ and $\rho_1 > 0$ ($\rho_1 \to 0$ as $\rho \to 0$) such that for each $x \in N_{\rho_1}(\theta)$ there is a path $\phi^x(\cdot)$ such that $\phi^x(0) = x$, $\phi^x(t_x) = \theta$ for some $t_x \le T$ and $S_G(t_x, \phi^x) \le \rho$.

It is sometimes convenient to define $\phi^x(t)$ for $t \ge t_x$ without increasing the cost. To do this, we let $\phi^x(\cdot)$ satisfy (6.1) beyond $t_x$. For suitable $\rho_2$ (going to zero as $\rho$ and $\rho_1 \to 0$), we can suppose that $\phi^x(\cdot)$ never leaves $N_{\rho_2}(\theta)$.

Define (if the set is empty, define the inf to be $\infty$)

$S_D(x) = \inf\{S_G(T,\phi): \phi(0) = x,\ \phi(T) \notin D,\ T < \infty\}$.

Given $\delta > 0$, there is a $T_\delta < \infty$ and for each $x \in D$ a path $\bar\phi^x(\cdot)$ on the interval $[0,T_\delta]$ such that $\bar\phi^x(0) = x$, $\bar\phi^x(t_x) \in \partial D$ at some $t_x \le T_\delta$, $\bar\phi^x(\cdot)$ satisfies (6.1) after $t_x$ and $S_G(T_\delta, \bar\phi^x) \le S_D(x) + \delta$. This fact, which will be used frequently in the sequel, follows from the following two observations:

(1) For each $\rho > 0$ and any set of paths in $G$ with uniformly bounded costs, there is a $T_\rho < \infty$ such that the paths must spend all but at most $T_\rho$ units of time in $N_\rho(\theta)$;

(2) There is a $T_\rho < \infty$ and paths $\phi^x(\cdot)$ on the interval $[0,T_\rho]$ taking $x \in N_\rho(\theta)$ to $\theta$ and then to $\partial D$ at some time $\le T_\rho$, with cost $\le S_D(\theta) + \rho_1$, where $\rho_1 \to 0$ as $\rho \to 0$.

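We will next show that $S_D(x) \to S_D(\theta)$ as $x \to \theta$. By the comments in
the previous paragraphs, $\limsup_{x \to \theta} S_D(x) \le S_D(\theta)$, since $x$ can be
connected to $\theta$ by a path with arbitrarily small cost. If $\liminf_{y \to \theta} S_D(y) = \infty$,
then we are done. Thus, we need only work with sequences $x_n \to \theta$ such
that $\sup_n S_D(x_n) < \infty$. Let $x_n \to \theta$ and fix $\delta > 0$. There are $\phi_n^\delta(\cdot)$ and
bounded $T_n^\delta$ (by, say, $T$) such that $\phi_n^\delta(0) = x_n$, $\phi_n^\delta(T_n^\delta) \in \partial D$ and
$S_G(T_n^\delta,\phi_n^\delta) \le S_D(x_n) + \delta$. For $T \ge t \ge T_n^\delta$, let $\phi_n^\delta(\cdot)$ satisfy (6.1). Choose (and index
by $n$) a convergent subsequence of $\{T_n^\delta, \phi_n^\delta(\cdot)\}$ with limit $\{\bar{T}, \phi^\delta(\cdot)\}$. Then
$\phi^\delta(0) = \theta$ and $\phi^\delta(\bar{T}) \in \partial D$. By the l.s.c. of $S_G(T,\cdot)$,

$\delta + \liminf_n S_D(x_n) \ge \liminf_n S_G(T,\phi_n^\delta) \ge S_G(T,\phi^\delta) = S_G(\bar{T},\phi^\delta) \ge S_D(\theta)$.

Thus, since $\delta$ is arbitrary,

$\lim_{x \to \theta} S_D(x) = S_D(\theta)$.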
Assumptions. We carry over the assumptions from Sections 1 and 2. For
the escape time problem one must redefine the $\bar{b}(x)$ and $H(\alpha,x)$ of (2.1)
and (2.2). Let $\mathcal{B}_m$ be the minimal $\sigma$-algebra measuring $\{\xi_i, i \le m\}$, and if
$M$ is a stopping time for $\{\mathcal{B}_m\}$, let $\mathcal{B}_M$ denote the associated $\sigma$-algebra. If
$\tau$ is a stopping time for the continuous parameter process $x^\epsilon(\cdot)$, we use
$\mathcal{B}^\epsilon(\tau)$ instead of $\mathcal{B}_{[\tau/\epsilon]}$. Suppose that the limits in (6.4) and (6.5) exist
uniformly in the stopping time $M$, in $\omega$, and in $(x,\alpha)$ in any compact set,
and that $H(\cdot,x)$ is differentiable. Those properties hold for the processes
listed below (2.2).

(6.4)  $\bar{b}(x) = \lim_n \frac{1}{n} E_{\mathcal{B}_M} \sum_{j=M}^{n+M-1} b(x,\xi_j)$,

(6.5)  $H(\alpha,x) = \lim_n \frac{1}{n} \log E_{\mathcal{B}_M} \exp \alpha' \sum_{j=M}^{n+M-1} b(x,\xi_j)$.

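For $\tau$ a stopping time for $x^\epsilon(\cdot)$, let $P_{x,\mathcal{B}^\epsilon(\tau)}$ denote the conditional
probability measure of the process $x^\epsilon(\cdot)$ which is reset to $x$ at time $\tau$,
then evolves as before (using the $\{\xi_j\}$ after time $\tau$), conditioned on
the data $\mathcal{B}^\epsilon(\tau)$ up to $\tau$. Define $S_G(T,A) = \inf_{\phi \in A} S_G(T,\phi)$. Then for each
$T < \infty$ and $A \subset C_x[0,T]$, we have

(6.6a)  $-S_G(T,A^0) \le \liminf_{\epsilon \to 0} \epsilon \log P_{x,\mathcal{B}^\epsilon(\tau)}\{x^\epsilon(\tau + \cdot) \in A\}$
        $\le \limsup_{\epsilon \to 0} \epsilon \log P_{x,\mathcal{B}^\epsilon(\tau)}\{x^\epsilon(\tau + \cdot) \in A\} \le -S_G(T,\bar{A})$.

The 'rate' at which the inequalities hold is uniform in $\tau$ and $\omega$ in the sense
that (e.g.), for each $h > 0$ there is $\epsilon_0 > 0$ such that for all $\tau$, $\omega$ and $\epsilon \le \epsilon_0$,

(6.6b)  $\exp -[S_G(T,A^0) + h]/\epsilon \le P_{x,\mathcal{B}^\epsilon(\tau)}\{x^\epsilon(\tau + \cdot) \in A\} \le \exp -[S_G(T,\bar{A}) - h]/\epsilon$.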

The 'uniformity' in (6.6) follows from the uniformity of the convergence
in (6.4) and (6.5) in the same variables $(\omega,M)$. In fact, our derivation started
with (3.2), (3.3), obtainable from [6]. It follows from the derivation in [6]
(although not mentioned explicitly there) that the probability in (3.3) can be
replaced by $P_{x,\mathcal{B}_M}\{\{Y_i^{\epsilon,\Delta}, i \le N\} \in B\}$ with the 'rate' at which the
inequalities hold being uniform in all variables $(\omega,M)$ in which the
convergence in (6.4), (6.5) is uniform. Here, $P_{x,\mathcal{B}_M}$ denotes the probability
measure (conditioned on $\mathcal{B}_M$) of $\{X_n^\epsilon\}$ (or $\{Y_i^{\epsilon,\Delta}\}$) reset to $x$ at time
$M$, then evolving as before (using $\xi_{j+M}$, $j \ge 0$, after $M$).

We will make one additional assumption. Let $D_\delta$ denote a
$\delta$-neighborhood of $D$ with $D_0 = D$. Then, clearly, $S_{D_\delta}(\theta)$ decreases as
$\delta \downarrow 0$. We assume that $S_{D_\delta}(\theta) \downarrow S_D(\theta)$ as $\delta \downarrow 0$. If this condition doesn't
hold for $D$ it will hold for an arbitrarily small perturbation of $D$. If $\theta$
lies in the interior of $G$ and if the optimal exit path does not hit $\partial G$, then
this condition is implied by the non-degeneracy assumption.

Theorem 5. Under the assumptions in the above subsection, (6.2) holds.


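Proof. Part 1. We follow [10] as closely as possible and omit details when
they are sufficiently close to those in [10]. Assume $S_D(\theta) < \infty$. Otherwise, a
similar proof yields the result (in fact, $x^\epsilon(\cdot)$ cannot then escape $D$ with
full probability for small $\epsilon$). Let $0 < \mu_0 < \mu_1 < \mu_2$. Define $g_0 = N_{\mu_0}(\theta)$,
$\Gamma_0 = N_{\mu_2}(\theta) - N_{\mu_1}(\theta)$, with all $N_{\mu_i}(\theta)$ contained in $D$. Define the
stopping times $\{\sigma_n,\tau_n\}$ by $\tau_0 = 0$ and

$\sigma_n = \inf \{ t \ge \tau_n : x^\epsilon(t) \in \Gamma_0 \}$,

$\tau_n = \inf \{ t \ge \sigma_{n-1} : x^\epsilon(t) \in g_0 \cup (G - D) \}$,

and set $Z_n = x^\epsilon(\tau_n \wedge \tau_D^\epsilon)$. For notational simplicity, we omit the
$\epsilon$-dependence on $\sigma_n$, $\tau_n$, $Z_n$. We have (for $x \in g_0$; otherwise we add a term
$E_x \tau_1$, which is bounded uniformly in $x$ and $\epsilon$, to (6.7))

(6.7)  $E_x \tau_D^\epsilon = \sum_{n=0}^{\infty} E_x I\{Z_n \in g_0\} E_{Z_n,\mathcal{B}^\epsilon(\tau_n)}(\tau_{n+1} - \tau_n)$.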
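The cycle structure behind (6.7) can be made concrete on a sampled path. The sketch below is an illustration under stated assumptions, not part of the proof: the path is one-dimensional with $\theta = 0$, it is observed on a uniform grid of spacing `dt`, neighborhoods are taken as $\{|x| < \mu\}$, and reaching $\Gamma_0$ is detected as the first sample with $|x| \ge \mu_1$ (a continuous path would first do so inside $\Gamma_0$). The function name and arguments are hypothetical.

```python
def cycle_decomposition(path, dt, mu0, mu1, d_radius):
    """Split a sampled path into cycles (sigma_n, tau_{n+1}, Z_{n+1}).

    path      : list of samples x(k * dt), starting inside g_0 = {|x| < mu0}
    mu0 < mu1 : radii of the inner neighborhoods around theta = 0
    d_radius  : D = {|x| < d_radius}; leaving D ends the decomposition
    """
    cycles = []
    k, n = 0, len(path)
    while k < n:
        # sigma_n: first sample at or after tau_n that has left N_{mu1}(theta)
        while k < n and abs(path[k]) < mu1:
            k += 1
        if k == n:
            break
        sigma = k * dt
        # tau_{n+1}: first later sample in g_0 or outside D
        while k < n and mu0 <= abs(path[k]) < d_radius:
            k += 1
        if k == n:
            break
        cycles.append((sigma, k * dt, path[k]))
        if abs(path[k]) >= d_radius:
            break                      # the path has escaped from D
    return cycles


if __name__ == "__main__":
    demo = [0.0, 0.05, 0.2, 0.25, 0.05, 0.0, 0.22, 0.5]
    print(cycle_decomposition(demo, 1.0, 0.1, 0.2, 0.4))
```

Each returned triple corresponds to one term of the sum in (6.7): the escape time is the sum of the cycle lengths, and the number of cycles is governed by how often the path returns to $g_0$ before leaving $D$.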


The theorem will be proved via estimates of the terms in (6.7).
We have, for $x \in g_0$,

(6.8)  $\inf_{y \in \Gamma_0,\omega,n} E_{y,\mathcal{B}^\epsilon(\sigma_n)}(\tau_{n+1} - \sigma_n) \le E_{x,\mathcal{B}^\epsilon(\tau_n)}(\tau_{n+1} - \tau_n)$
       $\le \sup_{y \in \Gamma_0,\omega,n} E_{y,\mathcal{B}^\epsilon(\sigma_n)}(\tau_{n+1} - \sigma_n) + \sup_{y \in g_0,\omega,n} E_{y,\mathcal{B}^\epsilon(\tau_n)}(\sigma_n - \tau_n)$.

It can be shown that there are $k_i > 0$ (depending on the $\mu_i$) such that the
left side of (6.8) is bounded below by $k_1$ and the first term on the r.h.s. is
bounded above by $k_2$ (the latter fact follows from an argument similar to
that which uses Lemma 3 in Part 3 below).

Let $d > 0$. Let $\phi(\cdot)$ denote a $d/4$-optimal path from $\theta$ to $\partial N_{\mu_2}(\theta)$,
and write $h = \mu_2 - \mu_1$. For small $\mu_0$, there is a $T < \infty$ (depending on $\mu_i$
but not on $x$) such that for each $x \in N_{\mu_0}(\theta)$, there is a path $\phi^x(\cdot)$ taking
$x$ to $\theta$ (with cost $\le d/4$), then (using the first part of $\phi(\cdot)$ here) $\theta$ to
$\partial N_{\mu_2}(\theta)$, at a total cost no greater than $d/2$. Then, there is an $\bar\epsilon > 0$ such
that for $\epsilon \le \bar\epsilon$ and all $\omega$, $n$

(6.9)  $P_{x,\mathcal{B}^\epsilon(\tau_n)}\{d(x^\epsilon(\tau_n + \cdot), \phi^x(\cdot)) < h/2\} \ge \exp -d/\epsilon$.

The $\bar\epsilon$ can be chosen to be independent of $x \in N_{\mu_0}(\theta)$, although we omit
the details. (The argument is similar to that used below to get the uniform
bound on the terms in the sum in (6.16).)

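Let $\rho + \tau_n = \sigma_n$ denote the first escape time into $\Gamma_0$ after $\tau_n$.
Then (in this calculation, we let $x^\epsilon(\tau_n) \in g_0$, but for simplicity we omit the
associated notation)

(6.10)  $E_{x,\mathcal{B}^\epsilon(\tau_n)} \rho \le T \sum_{m=0}^{\infty} P_{x,\mathcal{B}^\epsilon(\tau_n)}\{\rho > mT\}$.

By (6.9),

$P_{x,\mathcal{B}^\epsilon(\tau_n)}\{\rho > mT + T\} = E_{x,\mathcal{B}^\epsilon(\tau_n)} [1 - P_{x^\epsilon(\tau_n + mT),\mathcal{B}^\epsilon(\tau_n + mT)}\{\rho - mT \le T\}] I\{\rho > mT\}$
$\le (1 - \exp -d/\epsilon) E_{x,\mathcal{B}^\epsilon(\tau_n)} I\{\rho > mT\}$
$\le (1 - \exp -d/\epsilon)^{m+1}$.

Thus $E_{x,\mathcal{B}^\epsilon(\tau_n)} \rho \le T \exp d/\epsilon$ for small $\epsilon$.

Putting these estimates together yields that (up to a multiplicative factor
in $[k_1, k_2 + T \exp d/\epsilon]$ for arbitrarily small $d$), (6.7) equals

(6.11)  $\sum_{n=0}^{\infty} P_x\{Z_n \in g_0\}$,

which we evaluate next.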

Part 2. Fix $d > 0$. For small $\mu_2$ there are $\delta_0 > 0$, $t_1 < \infty$ and $h > 0$
such that for each $x \in g_0$ there is a function $\phi^x(\cdot)$ on $[0,t_1]$ connecting $x$
to $\theta$, then $\theta$ to $\partial D_h = \partial N_h(D)$ at some time $t_x \le t_1$ with the following
properties: $\phi^x(\cdot)$ satisfies (6.1) after $t_x$; $S_G(t_x,\phi^x) = S_G(t_1,\phi^x) \le S_D(\theta) + d/4$;
the distance from the set $g_0$ of the part of the path from first exit of $\Gamma_0$
to first reaching $\partial D_h$ is $\ge \delta_0$; the distance from the set $\Gamma_0$ of the part of
the path which connects $x$ to $\theta$ is $\ge \delta_0$. A similar construction was used
in [10, p. 124]. The fact that the minimum cost for hitting $\partial D_h$ is close to
that for hitting $\partial D$ (for small $h$) follows from the last
assumption stated above Theorem 5. Let $\delta_1 = \min(\delta_0,h)$. Then there is an
$\epsilon_0 > 0$ such that for $\epsilon \le \epsilon_0$ (we compare functions on the interval $[0,t_1]$
here) and $x \in g_0$,

(6.12)  $P_{x,\mathcal{B}^\epsilon(\tau_n)}\{Z_{n+1} \notin D\} \ge P_{x,\mathcal{B}^\epsilon(\tau_n)}\{d(x^\epsilon(\tau_n + \cdot), \phi^x(\cdot)) < \delta_1\}$
        $\ge \exp -[S_D(\theta) + d/2]/\epsilon$.

As for (6.9), $\epsilon_0$ can be chosen independently of $x \in g_0$. We have

$P_x\{Z_{n+1} \in g_0\} = E_x I\{Z_n \in g_0\} [P_{Z_n,\mathcal{B}^\epsilon(\tau_n)}\{Z_{n+1} \in g_0\}]$.

Using (6.12) to get an upper bound on the bracketed terms and iterating yields

(6.13)  $P_x\{Z_{n+1} \in g_0\} \le [1 - \exp -(S_D(\theta) + d/2)/\epsilon]^{n+1}$,

which yields the upper bound $\exp [S_D(\theta) + d/2]/\epsilon$ on the sum in (6.7)
(when the $(\tau_{n+1} - \tau_n)$ terms are dropped from the sum).
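The way the Part 1 and Part 2 estimates combine into the upper half of (6.2) can be written out explicitly. The following is a sketch of the omitted arithmetic, assuming the per-cycle bound $k_2 + T \exp d/\epsilon$ obtained from (6.8) and (6.10) and the geometric bound (6.13); the intermediate constant $3d/2$ is a consequence of those assumed bounds and is not stated in the source.

```latex
\begin{aligned}
E_x \tau_D^\epsilon
 &\le \bigl(k_2 + T e^{d/\epsilon}\bigr) \sum_{n=0}^{\infty} P_x\{Z_n \in g_0\} \\
 &\le \bigl(k_2 + T e^{d/\epsilon}\bigr) \sum_{n=0}^{\infty}
      \bigl[1 - e^{-(S_D(\theta) + d/2)/\epsilon}\bigr]^{n}
  = \bigl(k_2 + T e^{d/\epsilon}\bigr)\, e^{(S_D(\theta) + d/2)/\epsilon}, \\
\limsup_{\epsilon \to 0} \epsilon \log E_x \tau_D^\epsilon
 &\le d + S_D(\theta) + \tfrac{d}{2} = S_D(\theta) + \tfrac{3d}{2}.
\end{aligned}
```

Since $d > 0$ is arbitrary, the limsup is at most $S_D(\theta)$, which is the upper bound needed for (6.2).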

Part 3. To complete the proof, we need the following Lemma, whose proof is
very similar to that of Lemma 1.9 of [10, Chapter 6] and is omitted.

Lemma 3. Let $K$ be a compact set in $G$ which does not contain an
entire limit set for (6.1), and let $\tau$ denote a stopping time for $x^\epsilon(\cdot)$. Define
$\tau_K = \min\{t : x^\epsilon(\tau + t) \notin K\}$. Then there are $c > 0$, $T_0 < \infty$, $\epsilon_0 > 0$ such
that for $\epsilon \le \epsilon_0$ and all $x \in K$ and all $T$

$P_{x,\mathcal{B}^\epsilon(\tau)}\{\tau_K > T\} \le \exp -c(T - T_0)/\epsilon$

for all $\tau$, $\omega$.


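Continuing with the proof of the Theorem, we have for any $t_2 < \infty$

(6.14)  $P_{x,\mathcal{B}^\epsilon(\tau_n)}\{Z_{n+1} \notin g_0\} = P_{x,\mathcal{B}^\epsilon(\tau_n)}\{Z_{n+1} \notin D\} \le \sup_{y \in \Gamma_0,\omega} P_{y,\mathcal{B}^\epsilon(\sigma_n)}\{Z_{n+1} \notin D\}$
        $\le \sup_{y \in \Gamma_0,\omega} P_{y,\mathcal{B}^\epsilon(\sigma_n)}\{\tau_{n+1} - \sigma_n > t_2\}$
        $+ \sup_{y \in \Gamma_0,\omega} P_{y,\mathcal{B}^\epsilon(\sigma_n)}\{\tau_{n+1} - \sigma_n \le t_2,\ Z_{n+1} \notin D\}$.

By Lemma 3, for any $k_3 < \infty$, there is a $t_2 < \infty$ such that for small $\epsilon$
the first term after the inequality of (6.14) is $\le \exp -k_3/\epsilon$. Fix $d > 0$. We
next show that

(6.15)  $\sup_{x \in \Gamma_0,\omega} P_{x,\mathcal{B}^\epsilon(\sigma_n)}\{\tau_{n+1} - \sigma_n \le t_2,\ Z_{n+1} \notin D\} \le \exp -(S_D(\theta) - d)/\epsilon$

for small $\epsilon > 0$.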

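Let $\sup_j |\xi_j| \le k_4$. The set $Q$ of all piecewise linear interpolations
of all paths $x^\epsilon(\cdot)$ on $[0,t_2]$ (over all $\epsilon$, initial conditions $x \in \Gamma_0$, and an
arbitrary $k_4$-bounded sequence $\{\tilde\xi_j\}$ used in lieu of $\{\xi_j\}$) is equicontinuous.
Let $Q_x$ denote the closure of the subset of functions in $Q$ with initial
condition $x$, and which hit $\partial D$ at some $t \le t_2$. For small $\mu_2$ and
$\phi(\cdot) \in Q_x$, $S_G(t_2,\phi) \ge S_D(\theta) - d/4$. Given $\delta > 0$, there are $N_\delta < \infty$ (not
depending on $x \in \Gamma_0$) and $\{\phi_i^x(\cdot), i \le N_\delta\}$ in $Q_x$ forming a $\delta/4$-net
on $Q_x$. Note that if $x_n \to x$ and $\phi^{x_n}(\cdot) \to \phi(\cdot)$, then $\phi(\cdot) \in Q_x$. Now,

(6.16)  $\sup_\omega P_{x,\mathcal{B}^\epsilon(\sigma_n)}\{\tau_{n+1} - \sigma_n \le t_2,\ Z_{n+1} \notin D\}$
        $\le \sum_{i=1}^{N_\delta} \sup_\omega P_{x,\mathcal{B}^\epsilon(\sigma_n)}\{d(x^\epsilon(\sigma_n + \cdot), \phi_i^x(\cdot)) \le \delta/2\}$.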

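We now show that the r.h.s. of (6.16) can be bounded independently of
$x \in \Gamma_0$. By (6.6), for each $x \in \Gamma_0$ there is an $\bar\epsilon(x) > 0$ such that for
$\epsilon \le \bar\epsilon(x)$

$\sup_\omega P_{x,\mathcal{B}^\epsilon(\sigma_n)}\{d(x^\epsilon(\sigma_n + \cdot), \phi_i^x(\cdot)) \le \delta/2\} \le \exp -(S_D(\theta) - d)/\epsilon$.

If $\inf_x \bar\epsilon(x) = 0$, then there are $x \in \Gamma_0$, $\phi(\cdot)$, $\epsilon_m \to 0$, $x_m \to x$, $\{i_m\}$, $\phi_{i_m}^{x_m}(\cdot) \to \phi(\cdot)$
such that on a set of positive probability for each $m$,

(6.17)  $P_{x_m,\mathcal{B}^{\epsilon_m}(\sigma_n)}\{d(x^{\epsilon_m}(\sigma_n + \cdot), \phi_{i_m}^{x_m}(\cdot)) \le \delta/2\} > \exp -(S_D(\theta) - d)/\epsilon_m$.

Again, by (6.6), there is an $\bar\epsilon > 0$ such that for $\epsilon \le \bar\epsilon$ and large $m$ and
all $\omega$

(6.18)  $\exp -(S_D(\theta) - d/2)/\epsilon \ge P_{x_m,\mathcal{B}^\epsilon(\sigma_n)}\{d(x^\epsilon(\sigma_n + \cdot), \phi(\cdot)) \le \delta\}$
        $\ge P_{x_m,\mathcal{B}^\epsilon(\sigma_n)}\{d(x^\epsilon(\sigma_n + \cdot), \phi_{i_m}^{x_m}(\cdot)) \le \delta/2\}$.

This contradicts (6.17). Thus, we can bound the r.h.s. of (6.16) above by

(6.19)  $N_\delta \exp -(S_D(\theta) - d)/\epsilon$

for all $x$, $\omega$, $n$ and small $\epsilon$.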

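Define $\nu = \min\{n : Z_n \notin g_0\}$. For small $\epsilon$,

$P_x\{\nu > n + 1\} = P_x\{Z_j \in g_0, \text{ all } j \le n + 1\}$
$= E_x I\{\nu > n\} P_{Z_n,\mathcal{B}^\epsilon(\tau_n)}\{Z_{n+1} \in g_0\}$

(6.20)  $\ge [\inf_{y \in \Gamma_0,\omega,n} P_{y,\mathcal{B}^\epsilon(\sigma_n)}\{Z_{n+1} \in g_0\}] P_x\{\nu > n\}$
        $\ge (1 - \exp -(S_D(\theta) - 2d)/\epsilon)^{n+1}$.

This, together with (6.7), (6.13) and the arbitrariness of $d$ yields the theorem
since the $E_{x,\mathcal{B}^\epsilon(\tau_n)}(\tau_{n+1} - \tau_n)$ values lie in the interval $[k_1, k_2 + T \exp d_1/\epsilon]$
for arbitrarily small $d_1$, as shown above. Q.E.D.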
7.  Remarks and Extensions.

7.1.  Exit points of $x^\epsilon(\cdot)$ from $D$. Let there be a finite number of points
$y_1, \ldots, y_q \in \partial D$ such that

$\inf_{T>0} \inf_{\phi \in A_i} S_G(T,\phi) = \inf_{T>0} \inf_{\phi \in A} S_G(T,\phi)$,

where $A_i = \{\phi(\cdot) \in C_x[0,T] : \phi(T) = y_i\}$, $A = \{\phi(\cdot) \in C_x[0,T] : \phi(T) \in \partial D\}$.
Then, as in [10], for each $x \in D$ and $\delta > 0$,

$\lim_{\epsilon \to 0} P_x\{d(x^\epsilon(\tau_D^\epsilon), \cup_i y_i) < \delta\} = 1$.

7.2.  Global behavior of $x^\epsilon(\cdot)$ on $[0,\infty)$. Assume the non-degenerate case.
Let $K_1, \ldots, K_m$ denote a collection of disjoint compact sets, each one of which
is a limit set for (1.2), and such that $\cup K_i$ contains all the limit sets for
(1.2). If $K_i \cap \partial G \ne \emptyset$, let $K_i = \theta_i$, a single point. For a diffusion with
small noise, [10, Chapter 7] obtains the (asymptotic) probabilities of transition
from a neighborhood of $K_i$ to one of $K_j$, and the (asymptotic) mean times
spent in a neighborhood of any subset of $\{K_i, i \le m\}$, before exiting to a
neighborhood of another subset of the $\{K_i, i \le m\}$.

Although our process is not Markov, similar results can be obtained here.
Let $g_i$ denote a $\mu_0$-neighborhood of $K_i$ and let $\Gamma_i$ denote the set
$N_{\mu_2}(K_i) - N_{\mu_1}(K_i)$. Define $\tau_0 = 0$, $\sigma_n = \inf\{t \ge \tau_n : x^\epsilon(t) \in \cup_i \Gamma_i\}$ and
$\tau_n = \inf\{t \ge \sigma_{n-1} : x^\epsilon(t) \in \cup_i g_i\}$. Set $Z_n = x^\epsilon(\tau_n)$. Via the methods in the last section,
one can get upper and lower estimates for $P_{x,\mathcal{B}^\epsilon(\tau_n)}\{Z_{n+1} \in g_j\}$ for
$x \in g_i$. These would then be used to obtain the results of [10, Chapter 7]
exactly as the $P_x\{Z_{n+1} \in g_j\}$ are used in the Markov process case of that
reference. All the limit expressions carry over, with use of our action
functional $S_G(T,\phi)$ in lieu of the action functional $S_{0T}(\phi)$ of [10].


7.3.  Stochastic approximation. Let $a_n > 0$, $a_n \to 0$, $\sum a_n = \infty$. The results of
Sections 1 to 5 can be carried over to the projected stochastic approximation
(SA)

$X_{j+1} = \Pi_G(X_j + a_j b(X_j,\xi_j))$,

where we use the conditions on $\{\xi_j\}$ and $b(\cdot,\cdot)$ of Section 2. Define
$t_n = \sum_{i=0}^{n-1} a_i$ and the shifted processes

$X_{j+1}^n = \Pi_G(X_j^n + a_j b(X_j^n,\xi_j))$,  $j \ge n$,  $X_n^n = x$,

$x^n(t) = \dfrac{X_{j+1}^n (t - t_j + t_n) - X_j^n (t - t_{j+1} + t_n)}{t_{j+1} - t_j}$  on  $[t_j - t_n, t_{j+1} - t_n)$,

$\tau_D^n = \min\{t : x^n(t) \notin D\}$.
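The shifted, interpolated SA process defined above can be sketched in code. This is a minimal illustration under assumptions that are not in the text: a one-dimensional $G = [0,1]$, a hypothetical $b(x,\xi) = -x + \xi$ in the usage example, noise uniform on $[-1,1]$, and gains $a_j = 1/j$ with the starting index $n \ge 1$. The function names are illustrative.

```python
import random


def projected_sa(x0, n_start, n_steps, b, project, rng,
                 gain=lambda j: 1.0 / j):
    """Iterate X_{j+1} = Pi_G(X_j + a_j b(X_j, xi_j)) for j >= n_start.

    Returns (times, xs): times[k] = t_{n+k} - t_n (shifted interpolation
    times) and xs[k] = X^n_{n+k}, with xs[0] = x0. Requires n_start >= 1
    for the default gain a_j = 1/j.
    """
    times, xs = [0.0], [x0]
    x, t = x0, 0.0
    for j in range(n_start, n_start + n_steps):
        a_j = gain(j)
        xi = rng.uniform(-1.0, 1.0)        # bounded noise (an assumption)
        x = project(x + a_j * b(x, xi))    # projected SA step
        t += a_j
        times.append(t)
        xs.append(x)
    return times, xs


def interpolate(times, xs, t):
    """Piecewise-linear interpolation x^n(t) between the stored iterates."""
    for j in range(len(times) - 1):
        if times[j] <= t <= times[j + 1]:
            w = (t - times[j]) / (times[j + 1] - times[j])
            return (1.0 - w) * xs[j] + w * xs[j + 1]
    raise ValueError("t lies outside the simulated time interval")


if __name__ == "__main__":
    clip = lambda x: min(max(x, 0.0), 1.0)
    ts, vals = projected_sa(0.5, 1, 100, lambda x, xi: -x + xi, clip,
                            random.Random(0))
    print(ts[-1], interpolate(ts, vals, 1.0))
```

The interpolation weights agree with the formula for $x^n(t)$ above: on each interval the value moves linearly from $X_j^n$ to $X_{j+1}^n$ as $t$ runs over a span of length $a_j = t_{j+1} - t_j$.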

References [8], [13] deal with the (unprojected) SA problem via large
deviations. It is easy to incorporate the method of [8] with the 'projected'
case of this paper, by accounting for the 'time varying' scaling $\{a_j\}$. We cite
only one result (the Kiefer-Wolfowitz case can also be treated).

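For $a_n = 1/n$, use the action functional

$S_G(T,\phi) = \int_0^T e^s L_G(\dot\phi_s,\phi_s)\,ds$,

and for $a_n = 1/n^\alpha$, $\alpha < 1$, use $S_G(T,\phi) = \int_0^T L_G(\dot\phi_s,\phi_s)\,ds$. Then for
$A \subset C_x[0,T]$,

$-\inf_{\phi \in A^0} S_G(T,\phi) \le \liminf_n a_n \log P_x\{x^n(\cdot) \in A\}$
$\le \limsup_n a_n \log P_x\{x^n(\cdot) \in A\}$
$\le -\inf_{\phi \in \bar{A}} S_G(T,\phi)$.

Let $A = \{\phi(\cdot) : \phi(0) = x,\ \phi(t) \notin D, \text{ some } t \le T\}$, $x \in D$, where we define
$\theta$ and $D$ as in Section 6. Then, under the 'continuity' condition on $\partial D$ just
above Theorem 5,

$\lim_{x \to \theta} \lim_n a_n \log P_x\{\tau_D^n \le T\} = -\inf_{\phi \in A} S_G(T,\phi)$.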

